Git is a version control system, originally built to help developers work together on software projects. It has its own command syntax, but for the purpose of this handout we are going to use the easy version.
This is done with repositories, or repos, which are basically highly structured, super-powered team drives. The data science world has taken it to a new level because of the push toward open science.
Simply put, it lets you keep control of multiple edits while working with other people.
Exposure: If someone wants your code, they can just clone it from your GitHub.
Help each other: Let’s say I want to improve someone else’s code. I can fork their repository, add a feature or fix bugs, and send the changes back. GitHub will automatically notify the author, and they can keep or reject the edits.
Because it’s sloppy, nobody will hire you, and nobody likes having fifteen files called Version 1, Version 2, Final Version, and The Actual Final Version.
We develop new tools to make life slightly easier. For example, why do we use Gmail instead of writing letters, paying for postage, and mailing them to each other? Technology advances, and people who use it have more time for fun. That’s right, science = fun.
GitHub is a website that hosts your data, but it also has a very powerful software component that can be accessed directly from within R. So in a way, yes, GitHub is like Dropbox or Google Drive.
There are plenty of alternatives to GitHub, but we are choosing this one because it has seemingly come to dominate the Git field.
Go to GitHub.com and make an account. While you are there, download the software client for it.
A repository is essentially a folder for a single project. I am going to create a GitHub project called IntroToGit, which holds an R Markdown file in which I explain how to use GitHub for a regression class at Humboldt State University.
We start by selecting New Repository and choosing a path or folder for it.
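If you are curious what the client is doing behind the button, here is a rough command-line sketch of the same step. It assumes Git is installed; the scratch folder made with `mktemp` is only a stand-in so nothing real gets touched, and only the project name IntroToGit comes from this handout.

```shell
# Make a scratch folder so the sketch does not touch any real project (assumption).
workdir=$(mktemp -d)
repo="$workdir/IntroToGit"

# A repository is just a project folder that Git has been told to track.
mkdir "$repo"
cd "$repo"
git init --quiet

# Git keeps its bookkeeping in a hidden .git folder inside the project.
ls -a
```

The `.git` folder in the listing is the only difference between a plain folder and a repository.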
Nice! Fill your repository with nice and shiny things. At some point, you’ll want to publish the repository so it can go up on your account on the GitHub website.
For example, my repository consists of the R Markdown file, the HTML file for it, and some screenshots for you to follow along.
Actually, exactly this picture below.
Before you can publish you need to do three things.
Look at that, I added a summary and a description. I’m already better than half the packages on GitHub.
Once you have published, the whole world can see your repository!
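For the command-line crowd, the summary and description correspond to two `-m` flags on `git commit`, and publishing corresponds to pushing to a remote. The sketch below is an approximation under assumptions: a local bare repository stands in for GitHub (a real remote would be a GitHub URL tied to your account), and the file name, commit text, and author identity are made up for illustration.

```shell
workdir=$(mktemp -d)
repo="$workdir/IntroToGit"
mkdir "$repo"
cd "$repo"
git init --quiet

# Placeholder identity so the commit works on a fresh machine (assumption).
git config user.name "Example Student"
git config user.email "student@example.com"

# A stand-in file; the real repository holds the R Markdown file and screenshots.
echo "# Intro to Git" > IntroToGit.Rmd

# Stage the file, then commit with a summary (first -m) and a description (second -m).
git add IntroToGit.Rmd
git commit --quiet -m "Add handout draft" -m "First version of the R Markdown handout."

# A local bare repository plays the role of GitHub in this sketch.
git init --quiet --bare "$workdir/remote.git"
git remote add origin "$workdir/remote.git"
branch=$(git rev-parse --abbrev-ref HEAD)
git push --quiet origin "$branch"

# Anyone who can reach the remote can now clone a full copy.
git clone --quiet "$workdir/remote.git" "$workdir/copy"
```

The clone at the end is what "the whole world can see your repository" looks like from the other side: a complete copy, history included.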
Alright, so let’s talk about branches.
* These are the part of Git built for collaboration
* Instead of always working in one specific master file and messing that up, we can work in branches and discuss whether or not we want changes.
Don’t tempt me Ben Stiller, I’ll do it
Make sure you are in the proper branch before experimenting.
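Checking your branch takes one command if you ever work outside the client. This is a minimal sketch, assuming Git is installed; the file, the author identity, and the branch name `gif` are placeholders chosen for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet

# Placeholder identity so the commit works on a fresh machine (assumption).
git config user.name "Example Student"
git config user.email "student@example.com"

# A branch needs at least one commit to grow from.
echo "title: Intro to Git" > handout.txt
git add handout.txt
git commit --quiet -m "Start handout"

# Create a branch for the experiment and switch to it in one step.
git checkout --quiet -b gif

# Ask Git which branch you are on before changing anything.
current=$(git rev-parse --abbrev-ref HEAD)
echo "Currently on: $current"
```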
So here we are in the Gif Branch. I want to add a gif to the title because I think it’s a great idea.
* This is an example of something you want to discuss before putting it in the final version
Green plus signs indicate something was added; the orange circle indicates something was edited. You can drill down to see more.
The red lines indicate removal of a line, while green lines indicate addition.
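The same red and green view exists on the command line as `git diff`, where removed lines start with `-` and added lines start with `+`. Here is a small sketch under assumptions: the file contents and author identity are invented for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet

# Placeholder identity so the commit works on a fresh machine (assumption).
git config user.name "Example Student"
git config user.email "student@example.com"

# Commit a two-line file as the starting point.
printf 'title\nold line\n' > handout.txt
git add handout.txt
git commit --quiet -m "Start handout"

# Edit the file: swap the second line for a new one.
printf 'title\nnew line\n' > handout.txt

# Lines starting with - were removed (red); lines starting with + were added (green).
changes=$(git diff --no-color -- handout.txt)
echo "$changes"
```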
GitHub will tell you if you did it right.
I’d recommend deleting branches once they have been merged: they are redundant and will just confuse you in the future.
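As a command-line sketch of that cleanup, the sequence below merges a finished branch and then deletes it. It assumes Git is installed and that the merge is a simple one with no conflicts; the branch name `gif`, the file, and the author identity are placeholders.

```shell
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet

# Placeholder identity so the commits work on a fresh machine (assumption).
git config user.name "Example Student"
git config user.email "student@example.com"

echo "title" > handout.txt
git add handout.txt
git commit --quiet -m "Start handout"

# Remember the name of the starting branch (master or main, depending on setup).
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Do the experiment on its own branch.
git checkout --quiet -b gif
echo "gif goes here" >> handout.txt
git commit --quiet -am "Add gif to title"

# Switch back and bring the branch's changes in.
git checkout --quiet "$main_branch"
git merge --quiet gif

# -d refuses to delete a branch that has not been merged, so no work is lost.
git branch -d gif
git branch --list
```

Using lowercase `-d` rather than forcing the delete is the safe habit here, since Git will stop you if the branch still holds unmerged work.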
Forking a repository can allow for easy collaboration
GitHub (or other Git hosting services) allows for version control and open science, and encourages collaboration
Lots of jargon associated with it
Fully integrated into RStudio
Good on you. It’s kinda difficult to learn, but the best way to learn is just by doing. I’d recommend starting by putting a small project on GitHub and making a small change to see what happens.